Dangers of Unphysical Regions
Abstract

We discuss the appearance of negative numbers of events in radiochemical
experiments and of a negative squared antineutrino mass $m^2$ in tritium beta
decay. Going beyond the standard discussion of how to extract upper limits
in those cases, we show that the problem is much more profound. We explain
the circumstances which are the likely cause of the persistently negative values
of $m^2$ in all modern tritium beta decay experiments.

Introduction.

A problem in data analysis which turns up occasionally in some experiments 
and notoriously in others is that of a parameter estimate occurring outside the 
theoretically allowed physical range of the parameter. The common attitude to 
this is that one may use the unphysical result to set a limit inside the physical 
range. If one assumes that a known, e.g. Gaussian, probability density function 
can be centered on the measured value found in the unphysical range, there exists 
a standard prescription for translating a confidence interval for the measured value 
into a confidence interval for the true value [?]. The discussion then generally
turns on whether one should use the full Gaussian for the confidence interval,
or only the part of the Gaussian tail which is inside the physical range, properly
renormalized [?].

For this discussion we shall use two well known examples where the problem with
unphysical regions appears. The first example is the distinction of a positive signal
over background, where both signal and background consist of countable events.
Our second example is the case of neutrino mass determination from the shape of
the electron energy spectrum in tritium $\beta$-decay. The neutrino mass example was
also used in refs. [?] and [3], where it was assumed that a Gaussian could meaningfully
be centered on the estimated value in the unphysical range. In this paper we shall
argue that the problem is more profound, and that the data analysis should be done
differently.

Radiochemical experiments. 

Consider the detection of radioactively decaying atoms in a radiochemical
experiment, such as the solar neutrino experiments [?]. The radioactive atoms have
been produced by exposure to the neutrino radiation from the sun, and the total
number of atoms collected gives information about the solar neutrino flux. The
atoms are counted when the radioactive decay takes place in a detector with well
known sensitivity, and the signature of these events is that they occur with a well
known time constant $\lambda$. There are no other characteristics permitting the distinction
of signal events from background events occurring randomly in time. The estimator
of the sum of neutrino flux and background is thus a Poisson-distributed discrete
random variable. It is essential to note that the number of counts during a period
of observation, a "run", is very small, sometimes zero. The general procedure for
analyzing such data has been well described by Cleveland [?].

The signal events are defined to occur with a time distribution $\exp(-\lambda t)$, and
the background is assumed constant in time. Thus the occurrence probability per
unit time is given by

$$f(t; a, b) = a\,e^{-\lambda t} + b , \tag{1}$$

where $a$ and $b$ are the signal and background intensities, respectively. Since both
the signal and the background are Poisson-distributed non-negative numbers, the
physical region is defined by $a$ and $b$ both being non-negative.

The estimators $a$ and $b$ are found by maximizing the likelihood function (or
minimizing the negative of the log-likelihood function) for $N$ observations of one
event each, at times $t_i$, $i = 1, \ldots, N$,

$$\mathcal{L}(t_1, \ldots, t_N \mid a, b) = e^{-a\Delta/\lambda - bT} \prod_{i=1}^{N} \left( a\,e^{-\lambda t_i} + b \right) , \tag{2}$$

where $T$ is the total observation time, and $\Delta$ is the sum of the $N$ time intervals
weighted by $\lambda$,

$$\Delta = \sum_{i=1}^{N} \left( e^{-\lambda t_{i-1}} - e^{-\lambda t_i} \right) . \tag{3}$$

Usually $a$ and $b$ are not required to be separately non-negative. If $a$ and $b$ are
significantly positive (as in the total fits to a large number of independent experimental
"runs") this causes no problem.

Consider, however, the situation in an individual run when the number of signal
events is zero or nearly zero. If $a$ is not required to be non-negative in the fit, the
maximum likelihood may occur for a value of $a$ which is negative as a result of a
pure statistical fluctuation.
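
The effect can be reproduced with a minimal sketch of the likelihood (2). The sketch rests on assumptions that are not in the text: a single uninterrupted counting interval starting at $t = 0$ (so that $\Delta = 1 - e^{-\lambda T}$), hypothetical event times and decay constant, and a brute-force grid scan standing in for a real minimizer. The function names are illustrative only.

```python
import math

def log_likelihood(times, a, b, lam, total_time):
    """Log of Eq. (2) for one uninterrupted counting interval [0, total_time].

    For a single interval, a*Delta/lam + b*T is the integral of
    f(t; a, b) = a*exp(-lam*t) + b over the interval.
    """
    delta = 1.0 - math.exp(-lam * total_time)
    expected = a * delta / lam + b * total_time
    log_sum = 0.0
    for t in times:
        rate = a * math.exp(-lam * t) + b
        if rate <= 0.0:
            return -math.inf  # the logarithm is undefined for a non-positive rate
        log_sum += math.log(rate)
    return -expected + log_sum

def scan_maximum(times, lam, total_time, a_grid, b_grid):
    """Brute-force grid maximization; returns (best_logL, a, b)."""
    best = (-math.inf, None, None)
    for a in a_grid:
        for b in b_grid:
            value = log_likelihood(times, a, b, lam, total_time)
            if value > best[0]:
                best = (value, a, b)
    return best

# Hypothetical run: four events, all late in a 100-day window, lam = 0.06 per day.
times = [62.0, 71.0, 85.0, 96.0]
lam, total_time = 0.06, 100.0
a_grid = [i * 0.005 for i in range(-40, 41)]  # -0.2 ... 0.2, negative values allowed
b_grid = [i * 0.002 for i in range(1, 101)]   # 0.002 ... 0.2

free = scan_maximum(times, lam, total_time, a_grid, b_grid)
physical = scan_maximum(times, lam, total_time, [a for a in a_grid if a >= 0.0], b_grid)
print("unconstrained:", free)
print("a >= 0 only:  ", physical)
```

With these invented event times the unconstrained scan prefers a negative $a$, while the scan restricted to the physical region settles on the boundary $a = 0$ with a pure background fit.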

Alternatively, there may be an unknown process contributing to the background so
that the data happen to exhibit a component increasing in time, for instance a late
accumulation of events which simulates an intensity increasing with time. Since this
is not taken care of by the $b$ assumed constant, it is misinterpreted as a signal
having the form $f(t; -a, b)$. In other words, the background hypothesis is wrong.

Since there is no physical theory in an unphysical region, $f(t; -a, b)$ is arbitrary,
and could often be replaced by some other arbitrary continuation. Thus if the
likelihood function has a deeper extremum for some negative $a$, this fact cannot
be used to make confidence statements about $a$ in the physical region, because the
choice of another continuation might yield a negative $a$ corresponding to a different
extremum. In any case one should bear in mind that the information obtained using
$f(t; -a, b)$ is information on the background, not on $a$.

A further problem is that $f(t; -a, b)$ is not a normalizable probability distribution.
It therefore does not make sense to describe the data by a Gaussian centered
on $-a$, and to make inferences from its tail in the physical region.

The only inference one can make about a negative signal is that the hypothesis
(1) is wrong, notably that the background is not well described by a constant. To
prove that the hypothesis (1) is wrong one has to make a goodness-of-fit test in the
physical region, for instance at $a = 0$, not a test which makes use of the arbitrary
and ill-defined likelihood in the unphysical region. Unfortunately the widely used
chisquare test is not reliable for very small numbers of events.
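
The chisquare test treats each count as approximately Gaussian, and that approximation fails for small expected counts. The following sketch compares the exact Poisson tail probability with its Gaussian approximation at the same nominal three-standard-deviation excess. The function names and the chosen means and counts are illustrative assumptions, not values from any experiment.

```python
import math

def poisson_tail(n_obs, mean):
    """Exact P(N >= n_obs) for a Poisson variable with the given mean."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(n_obs))
    return 1.0 - below

def gaussian_tail(n_obs, mean):
    """One-sided tail from the Gaussian approximation with variance = mean."""
    z = (n_obs - mean) / math.sqrt(mean)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Both cases sit three nominal standard deviations above the mean.
for mean, n_obs in [(1.0, 4), (100.0, 130)]:
    exact = poisson_tail(n_obs, mean)
    approx = gaussian_tail(n_obs, mean)
    print(mean, n_obs, exact, approx, exact / approx)
```

For an expected count of one, the exact tail probability exceeds the Gaussian value by more than an order of magnitude; the discrepancy is far smaller at an expected count of one hundred.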

If the extremum is on the edge of the allowed parameter range there may be a
problem with the convergence of numerical search algorithms such as those used in
MINUIT [?]. Note that the hypothesis used in an unphysical region may influence
the value of $a$ in the nearby physical region, because the search algorithms make
finite steps around the extremum.

The distribution of a consistent estimator converges in the (weak) limit of infinite
sample size to the delta function at the true value. An unbiased
estimator, repeatedly measured, returns values on both sides of the true value while
converging towards it. However, when the true value is exactly zero (an academic
case) and the estimator is a Poisson-distributed random variable restricted to
non-negative integer values, the measurement process can only converge from the positive
side, and thus it appears biased. But this is unavoidable, because mathematical statistics
does not admit any estimator for negative values of a Poisson variable. If instead
one does use an ad hoc estimator, not prescribed by the physical theory, which can
take values on both sides of zero, the gain in apparent unbiasedness is obtained at
the price of arbitrariness.

Let us return to the case when a large number of independent experimental runs
are done. The discussion has generally concerned the question whether $a$ should be
constrained to be non-negative in the individual runs before averaging. The answer
to this is clear: if the results are non-negative on average, one should not constrain
the individual runs because that would bias the final result towards higher values of
$a$. Although averaging unconstrained runs is not wrong, it implies the loss of real
experimental information. The advisable procedure is to fit all the runs simultaneously
with one common parameter $a$ and with (if necessary) different background parameters
$b_j$ for each run $j$. When the ensuing common $a$ is in the physical region, no error in
procedure will have occurred, because the runs which individually yielded negative
signals also contribute their share to the total likelihood at the common best value
of $a$ in the physical region. This is the procedure used for instance by GALLEX [?].
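
The upward bias from constraining individual runs can be illustrated with a deliberately simplified toy, which is not the GALLEX procedure nor the time-distribution fit of Eq. (2): each run is reduced to a single Poisson count with a known expected background, and the per-run signal estimate is the count minus that background. All numbers and function names are hypothetical.

```python
import math
import random

def poisson_sample(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def run_estimates(true_signal, background, n_runs, rng):
    """Per-run signal estimates n - background, with n ~ Poisson(signal + background)."""
    return [poisson_sample(true_signal + background, rng) - background
            for _ in range(n_runs)]

rng = random.Random(12345)
true_signal, background = 0.5, 2.0   # hypothetical expected counts per run
estimates = run_estimates(true_signal, background, 20000, rng)

mean_free = sum(estimates) / len(estimates)
mean_clamped = sum(max(0.0, e) for e in estimates) / len(estimates)
print("true signal:            ", true_signal)
print("average, runs unclamped:", round(mean_free, 3))
print("average, runs clamped:  ", round(mean_clamped, 3))
```

Averaging the unclamped estimates recovers the true signal within statistical error, whereas clamping each run at zero before averaging shifts the average visibly upwards.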

The neutrino mass problem. 

Consider now the case of the electron antineutrino mass determination from the
shape of the electron energy spectrum in tritium $\beta$-decay. The spectrum given by
Fermi theory is

$$f(E \mid A, m^2, E_0, b) = A\,p\,E\,F(E, E_0)\,(E_0 - E)\left[(E_0 - E)^2 - m^2\right]^{1/2} + b . \tag{4}$$

$F(E, E_0)$ is called the Fermi function (see e.g. ref. [?]), $p$ is the electron momentum,
and the root is taken to be real and positive. Note that the electron antineutrino mass $m$
enters only in the form $m^2$. The other parameters to be determined by the fit are the
total decay energy $E_0$, a positive normalization constant $A$, and the background $b$,
assumed constant. (Sometimes the background is described by an empirical function,
e.g. $b + cE$ [12], but this does not affect our general argument.)

The physical region is defined by conditions on $m^2$ and $E_0$. Clearly $m^2$ must be
non-negative. From the factor $(E_0 - E)$ we see that the theoretical electron spectrum
would be negative for energies $E > E_0$, and the square root becomes imaginary for
$E > E_0 - m$. Oddly enough, the square root term in Eq. (4) would be positive and real for
all energies $E < E_0$ if $m^2$ were allowed to be negative. This curious fact is the cause
of problems encountered in all fits, as has been pointed out before, and as we
shall discuss below.
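
The behaviour of the end-point factor in Eq. (4) is easy to tabulate. The sketch below keeps only $(E_0 - E)[(E_0 - E)^2 - m^2]^{1/2}$ and omits $A$, $p$, $E$, $F(E, E_0)$ and $b$; the end-point energy and the trial values of $m^2$ are illustrative assumptions, and the function name is introduced here for convenience.

```python
import math

def endpoint_factor(energy, e0, m2):
    """(E0 - E) * sqrt((E0 - E)^2 - m^2), or None where the root is imaginary.

    Only the end-point factor of Eq. (4) is kept; A, p, E, F(E, E0) and b are omitted.
    """
    radicand = (e0 - energy) ** 2 - m2
    if radicand < 0.0:
        return None
    return (e0 - energy) * math.sqrt(radicand)

e0 = 18570.0                      # illustrative end-point energy in eV (assumed value)
energies = [e0 - 50.0, e0 - 5.0, e0 + 5.0]
for m2 in (100.0, -100.0):        # m^2 in eV^2: a physical and an unphysical trial value
    print(m2, [endpoint_factor(e, e0, m2) for e in energies])
```

For the positive trial $m^2$ the factor is undefined at the two energies above $E_0 - m$; for the negative trial $m^2$ it is defined at all three energies, positive below $E_0$ and negative above it.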

The theory of beta decay does prescribe the electron energy spectrum to have
the form (4), but the atomic physics or molecular physics of the tritium compound
might cause modifications. The Fermi function depends on approximations which
we shall not discuss here. The experimental resolution function has to be folded
into the final spectrum, and not all possible instrumental distortions may be well
understood. In any case, the assumption [10, 11] that the response (resolution)
function of the $\beta$-spectrometer is Gaussian cannot be rigorously proved for the tail
of the response function.

Thus the form of the signal is not well known, and the hypothesis (4) as well as
the assumption for the background may be wrong. An accumulation of events at
energies near or beyond $E_0$ has been reported [10, 11], indicating problems with
Eq. (4). The experimental background is not due to tritium $\beta$-decay, and thus its
distribution does not respect the kinematical limit $E < E_0 - m$.

The estimators of $m^2$ and $E_0$ are random variables which depend on the data and
the method; in all recent experimental works (see [?] and references therein, and
refs. [10, ?]) the method is least squares minimization, the "chisquare" method [?].

The comments we made on the unphysical region in the radiochemical case are
mostly the same here, but there are important differences due to the more complicated
mathematical form of the "signal". From the expression (4) we obtain for
$E = E_0 - m$, if $m \neq 0$,

$$df(E_0 - m \mid A, m^2, E_0, b)/dm^2 = \infty . \tag{5}$$

Moreover, taking into account that the Fermi function $F(E, E_0)$ has no singularity
at $E = E_0 - m$, we obtain, if $m \neq 0$, from the expression (4) for $E = E_0 - m$
and any $n = 1, 2, \ldots$

$$\left| d^n f(E_0 - m \mid A, m^2, E_0, b)/dE^n \right| = \infty . \tag{6}$$

Thus there is no analytic continuation of the theoretical "signal" (4) from the physical
region into the unphysical region. It follows that it does not make sense to
describe the data by a Gaussian centered at $-m^2$, and to make inferences from its tail
in the physical region.
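
The divergence in Eq. (5) can be checked by differentiating Eq. (4) directly, on the assumption that $A$, $p$, $E$, $F(E, E_0)$ and $b$ are held fixed and do not depend on $m^2$. The following is a sketch of that step, not a derivation taken from the references:

```latex
\begin{align*}
\frac{\partial f}{\partial m^2}
  &= A\,p\,E\,F(E,E_0)\,(E_0-E)\,
     \frac{\partial}{\partial m^2}\left[(E_0-E)^2-m^2\right]^{1/2} \\
  &= -\,\frac{A\,p\,E\,F(E,E_0)\,(E_0-E)}
            {2\left[(E_0-E)^2-m^2\right]^{1/2}} .
\end{align*}
% As E -> E_0 - m from below, the numerator tends to A p E F(E,E_0) m,
% which is non-zero when m != 0, while the denominator tends to zero,
% so the derivative diverges in magnitude.
% For m = 0 and E < E_0 the ratio (E_0-E)/|E_0-E| equals 1 and the
% derivative stays finite, which is why the condition m != 0 is needed.
```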

And even if it did make sense (putting in an empirical analytic continuation "by
hand" [10, 11] and thereby changing the given problem of estimating the physical
non-negative $m^2$), one cannot derive a confidence limit on $m$ from $-m^2$ [?].

There are two possible reasons for obtaining a negative $m^2$ in a numerical search
for minimum chisquare. The first reason is trivial, and not of a statistical nature:
there is a real accumulation of events near the end point $E = E_0 - m$ caused by
instrumental problems or unknown physics or a wrong background hypothesis. Since
Eq. (4) does not account for this, it must be modified.

The conclusion is then the same as in the radiochemical case. A goodness-of-fit
test of the hypothesis (4) should be made in the physical region, not at $-m^2$. There
is again the caveat about the chisquare method not being a suitable test in a region
with very low statistics.

The second reason is quite non-trivial [13]. No tritium decay electron can have
an energy $E > E_0 - m$, so data obtained in that range must be background. Yet
it is impossible to restrict the fit of Eq. (4) to the physical range $E < E_0 - m$ and
to replace it in the range $E > E_0 - m$ by a pure background term, because $E_0$ is
unknown and itself a parameter to be determined in the fit. And even if we knew
from theory the approximate value of $E_0$ (as was assumed in [11]), the problem would be
the same, because we do not know the antineutrino mass.

This problem was already understood by the writers of MINUIT, who built 
an explicit check into the program, prohibiting the user from supplying different 
functions in different parameter ranges if the ranges depend on one of the parameters 
to be estimated [?].

Therefore, it is almost unavoidable that the fit covers data in some part of the
unphysical range $E > E_0 - m$, where the numerical search must make steps into
the region of negative $m^2$ in order to keep the root in (4) real. And, as already
mentioned in the radiochemical case, search algorithms like those used in MINUIT
only converge if they can make finite steps around the extremum, thus stepping
$E_0$ occasionally into the range $E_0 < E + m$. The convolution of the energy resolution
function into the spectrum may also be a cause for excursions into this unphysical
range.
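
One way to read this argument is as a constraint on $m^2$ set by the fitted energy range. The sketch below computes the largest $m^2$ for which the root in Eq. (4) stays real at every fitted energy. It ignores the resolution convolution, and the bin energies, the trial end point and the helper name `largest_real_m2` are hypothetical.

```python
def largest_real_m2(energies, e0):
    """Largest m^2 for which (E0 - E)^2 - m^2 >= 0 at every fitted energy."""
    return min((e0 - energy) ** 2 for energy in energies)

e0_trial = 18570.0                              # assumed trial end point in eV
bins = [18550.0 + 2.0 * i for i in range(16)]   # hypothetical bins, 18550 ... 18580 eV

bound_on_bin = largest_real_m2(bins, e0_trial)         # a bin sits exactly at the trial E0
bound_shifted = largest_real_m2(bins, e0_trial + 0.5)  # trial E0 moved by half an eV
print(bound_on_bin, bound_shifted)
```

When the fitted bins straddle the trial end point, the bound is zero or a fraction of an eV$^2$, so in this simplified picture any search step that must evaluate the model at all bins is confined to $m^2$ values at or below roughly zero.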

Of course, this situation is avoidable if one has sufficiently good estimates for $E_0$
and $m$ to be able to restrict the fit to experimental data in the range $E \ll (E_0 - m)$
far from the unphysical region, as was done in the past (see [?] and references
therein). Although one then obtains non-negative estimates for $m^2$, the resulting
statistical errors (dispersion) are big since the $\beta$-spectrum for $E$ far from $E = E_0 - m$
is very insensitive to the unknown antineutrino mass.
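
The loss of sensitivity far from the end point follows from the square root in Eq. (4): relative to $m = 0$, the end-point factor is reduced by $1 - [1 - m^2/(E_0 - E)^2]^{1/2}$. The sketch below evaluates this fraction, ignoring the background $b$ and the resolution function; the mass value and the distances from the end point are hypothetical.

```python
import math

def relative_suppression(distance, mass):
    """1 - sqrt(1 - m^2 / (E0 - E)^2): fractional reduction of the end-point
    factor of Eq. (4) relative to m = 0, at distance = E0 - E > m."""
    return 1.0 - math.sqrt(1.0 - (mass / distance) ** 2)

mass = 10.0  # hypothetical antineutrino mass in eV
for distance in (20.0, 100.0, 1000.0):  # E0 - E in eV
    print(distance, relative_suppression(distance, mass))
```

The fraction falls roughly as $m^2 / [2 (E_0 - E)^2]$, so data taken far below the end point constrain $m^2$ only weakly.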

We believe that the described circumstances are the main reason why all modern
experiments [?, 10, ?] have obtained results in the unphysical range $m^2 < 0$. We
are currently carrying out simulations to demonstrate this quantitatively [?], but
it would still be preferable if the experimental groups would confirm this on real
data.

Conclusions. 

For both cases studied, the determination of the solar neutrino flux by radiochemical
methods and the determination of the electron antineutrino mass from
the end point of the electron spectrum in tritium $\beta$-decay, the conclusions are the
same. An analysis which yields a signal estimate which is zero or negative obviously
tells us that there is no signal, only background. Thus the information obtained from
the unphysical region with a negative signal is not information on the signal
parameter, but on the background.

In the radiochemical case the functional form of the signal in the unphysical 
region is an increasing function which approaches a constant. If the background 
happens to have this form, for physical reasons or because of statistical fluctuations, 
a negative signal is the apparent result. The same is true in the electron antineutrino 
mass case but more importantly, the squared antineutrino mass is forced to be 
negative by attempts to fit the electron spectrum and the background in the energy 
region beyond the physical end point of the spectrum. 

In both cases a fit which is better in the unphysical region than in the physical
region is an accident, because the function which is correct in the physical region
is arbitrary in the unphysical region, and could often be replaced by some other
arbitrary continuation. Thus an extremum of the likelihood function in the unphysical
region cannot be used to make confidence statements about the parameter in
the physical region, because the choice of another arbitrary function would yield a
different result.

In particular it is impossible to obtain a confidence statement about the real
antineutrino mass from a negative unphysical estimator, $m^2 < 0$. All one can
conclude (barring mistakes on the functional form of the background) is that $m \neq 0$,
because if it were zero, there would not be any square root in Eq. (4), and nothing
would force the fit into the region of negative $m^2$ [13].

To decide whether the function composed of signal plus background is an ade- 
quate description of the data, one has to make a goodness-of-fit test in the physical 
region. It is misleading to make a likelihood ratio test comparing the likelihood in 
an unphysical region with the likelihood in the physical region, because the former is 
meaningless and the functions are not the same. It is doubtful whether the chisquare 
method is a suitable goodness-of-fit test near the edge of a physical region, because 
chisquare is not reliable for small samples. The resolution of this problem is to find
other statistical tests which are adapted to the situation.

We have shown that there are circumstances which cause a fit to the electron
spectrum in tritium $\beta$-decay to yield a negative $m^2$ value. Since these circumstances
are present in all modern analyses [?], [10], we believe we have found the reason
for the anomalous results.

We thank Fred James, CERN, for useful comments and valuable criticism. 